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ABSTRACT 

In this paper, we apply the ideas of the matrix column based 
diffusion approach to define a new eigenvector computation 
algorithm of a stationary probability of a Markov chain. 
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1. INTRODUCTION 

In this paper, we assume that the readers are already fa- 
miliar with the idea of the fluid diffusion associated to the 
D-iteration [2] to solve the equation: 

X = P.X + B 

and its application to PageRank equation [3]. 

For the general description of alternative or existing iter- 
ation methods, one may refer to 

2. ALGORITHM DESCRIPTION 
2.1 Notation 

We recall that the D-iteration is defined by the couple 
{P,B) G IR^''^ X IR^ and exploits two state vectors: Hr, 
(history) and _F„ (residual fluid): 

Fo = B (1) 

Fn = (/d- J,„ +PJ>„)fn-l. (2) 

where Id is the identity matrix, Jk a matrix with all entries 
equal to zero except for the fc-th diagonal term: {Jk)kk = 1, 
i„ the n-th node selected for the diffusion and 

n 

Hn = ^J,,ffc.-1 (3) 

fe=l 

= {Id- Ji„(Id- P))Hr.~i + J,„B. (4) 

The diffusion of a node i containing {F„)i — f means the 
following operation: 



Philippe Jacquet 

Alcatel-Lucent Bell Labs 
Route de Villejust 
91620 Nozay, France 

philippe . j acquet ©alcatel- lucent . com 

• (-^71)!+ = /: this defined Hn+i; 

• {F„)i = and {Fn)j+ = / x pji: this defines F„+i. 

Here, we'll use the equation satisfied by the state vectors 
(cf. H): 

+ F„ = P.H„ + B. (5) 

We'll use here the notation P.X meaning the usual matrix- 
vector product P X X. 

We define cr : IR^ -> M, by a{X) = ^i- We define 

the L\ norm, \X\ = X^ili 

Below, we denote by e the normalized unit column vector 
1/7V(1,..,1)'. 

2.2 Assumption 

We will assume in this paper that P is a stochastic matrix 
(we took the notation of the stochastic matrix per column), 
i.e. X^iPii ~ 1 fo'' each i. We define below the optimal 
algorithm to find X such that X = P.X. For the sake of 
simplicity, we will assume that P is irreducible and ergodic, 
so that P" converges to a unique solution. In fact, our 
approach finds a solution as soon as P is stochastic and with 
no empty columns or rows in P, or a bit more generally if 
P".e is convergent. 

2.3 Algorithm description 

We first apply a full product P.e then compensate by sub- 
stracting e so that a{P.e — e) = 0. Then we set Po ~ Pe — e 
and apply the diffusion iteration on (P, Po) (cf. [3], [5]). This 
means that we solve H such that H — P.H + P.e — e and 
X is obtained by _ff -|- e. Note that H + e is not necessarily 
normalized to 1. 

Remark I. The results we present here are in fact inde- 
pendent of the choice of e and we could use any other vector. 

2.4 Some intuition on the optimality 

The benchmark test to existing methods will be addressed 
in a future paper. From the intuition point of view, the D- 
iteration method was initially used to solve X = P.X + B 
when the spectral radius of P is strictly less than 1, so that 
the fluid B converges to zero exponentially. For the problem 
X = P.X, the direct application of the diffusion would keep 
the same amount of fluids in the system and the convergence 
can be obtained by Cesaro averaging (or multiplicative nor- 
malization) , but this would lead a very slow convergence [4] . 
The algorithm above starts by creating an initial vector in 
the kernel of a in which we have positive fluids compensating 



exactly all negative fluids. Then the convergence is obtained 
by the fact that positive fluids meeting negative fluids just 
vanishes. 

Now, intuitively, this is clearly optimal, because with the 
initial setting we are killing all cycles and the diffusion be- 
hind kills all remaining forwarding diffusions. 

3. CONVERGENCE 

Theorem 1. There exists a choice of the sequence of nodes 
such that the diffusion applied on {P,Fo) converges to X — e. 



Proof. Let's first prove that the usual iteration method: 
Hn+i — P.Hn+Fo is convergent. By induction, it is straight- 
forward to obtain: Hn — P".e — e. Since P".e — >■ X, we have 
Hoc = H = X - e. 

For the diffusion process, from the equations ((2| and ((Sj, 
we can easily prove by induction that: a{Fn) — a{Hn) = 0. 
Then we prove that \F„\ is non- increasing function: when 
the diffusion is applied on a positive fluid, the same amount 
is distributed; when they meet negative fluid, a part of them 
vanishes so that \F„ \ is decreased, if not is not modified. 
For a more rigorous proof: set j — in+i and/ = (fn)i„_,.i, 
then: 

l^^n + ll =^|(Fn + l).| + |(Fn + l)i| 

Let's call A the set of nodes i such that {Fn)i has a sign 
opposed to /. Then, 

l^^n + l|=;^(|(F„),| + |/b.,) + |/p,,| 
ie A 

= \FA + ^(|(fn), + fp^,\ - - |/|p,,). 

ie A 

Now, we use \x + y\ < \x\ + \y\ to get lF„+il < 
Therefore, |F„| is convergent. The limit is necessarily equal 
to zero, because of the irreducibility of P: there exists a 
path (diffusion sequence) such that positive and negative 
fluids necessarily meet each others. For all there ex- 
ists n such that {P)i'j > implying there is a path ii = 
i with strictly positive weight w{i,j) = Pii,i2 x 
.. X Pi„_i,i„ > 0. Applying the diffusion successively on the 
nodes i„, ..12 from Fm = F, we are sure to cancel at least 
mm{\{F)^\,w{i,j) X (taking {F)i x {F)j < 0). Now 

taking w = miui^j w{i,j), we are sure to cancel in less than 
TV diffusions at least w x max^ which means an expo- 

nential convergence, and this guarantee the convergence of 

Fin- 

This theorem allows us to apply heuristic policy to opti- 
mize the sequence of nodes for the diffusion. 

4. APPLICATION TO PAGERANK EQUA- 
TION 

The PageRank equation can be written as: 

X = P.X (6) 



where P = dPg + (1 — d)/NJ, where J is the matrix with all 
entries equal to 1 and Pg is the completed matrix from the 
initial Pg (corresponding to the web graph) by e on columns 
associated to dangling nodes (nodes with no outgoing links). 
If we apply the same method than above, we get: 

H = dPgH + d(Pg.e-e). 

We can rewrite d{Pg.e — e) as Fq = dPg.e — da{Pg.e)e. 

In this case, because of the factor d and the presence of 
dangling nodes, the iteration does not maintain H„,Fn in 
the kernel of a. However, the system is dominated by a 
system which is exponentially decreasing (d) and it is easy 
to prove its convergence for D-iteration. 

The limit of D-iteration on {dPg, Fo) satisfies: 



H = dPg.H + Fo 



and 



i>0 

= d'P'g.[{dPg.e -e) + {l-d + dfi)e] 
= -e + ^d*P;.(l-d + d/i)e, 

i>0 

where /i = 1 — a{Pg.e). 

The limit of D-iteration on {dPg,Fo) satisfies: 

H' = dPg.H' + Fo 

= dPg.H' + da{H' - Pg.H')e + Fo 

= dPg.H' + {dPg.e - e) 4- (1 - d + d/i + d/2)e 

= dPg.H' + {dPg.e -e) + {l-d + df)e 

where /a = a{H' - Pg.H') and f = fi + /2, and 

i>0 

= Yd'Pl.[{dPg.e~e) + {l~d + df)e] 

i>0 

= -e + J2 d'Pg-i^ -d + df)e. 

Therefore, we have: 

X = H' + e 

1-d + df 



H + e) 



l-d + dfi 

Then X can be computed from H by normalizing H + e to 



5. EXTENSION TO POSITIVE MATRIX 

Now, if P is a positive irreducible matrix, Perron-Frobenius 
theorem says that (p~^P)".e is convergent (p is the spectral 
radius of P) to the unique eigenvector which is strictly pos- 
itive. Therefore, to solve P.X — pX, we can apply the 
method above and solve H such that H = P' .H + P.e — e 
with P' = p-^P. We would have Hn = P'".e - e which 
converges to X — e. 

Theorem 2. There exists a choice of the sequence of nodes 
such that the diffusion applied on (P',P'.e — e) converges to 
X -e. 



N 


L/N 


D/N 


E/N 


O/N 




maxout 


10^ 


12.9 


0.041 


0.032 


0.236 


716 


130 


10* 


12.5 


0.008 


0.145 


0.114 


7982 


751 


10^ 


31.4 


0.027 


0.016 


0.175 


34764 


3782 


10^ 


41.2 


0.046 





0.204 


403441 


4655 



Table 1: Extracted graph: N = 10^ to lO". 



Proof. The proof is similar to the stochastic matrix case, 
except that we replace the Li norm by the norm \X\v ~ 

\xi X Vi\ where V is the left eigenvector of P' (which has 
all entries strictly positive). Then, _F„ is such that av{F„) — 

Xi X Vi — and F„ is non-increasing function for the 
norm \.\v- Of course, the beauty of this result is that we 
don't need the explicit expression of V. 

6. FIRST CONVERGENCE COMPARISON 

6.1 Error 

For the PageRank equation we will consider below, the 
distance to the limit is computed as follows: 

• for the power iteration (PI) , the distance to the limit is 
bounded by — X„| x d/{l — d); if d = 1, we use an 
estimate of the distance to the limit by |Xn-i-i — X„\; 

• for the initial D-iteration method (DI), the distance 
to the limit is given exactly by the Li norm of the 
residual fluid divided by 1 — d: |F,i|/(l — d); 

• for DI+, the distance to the limit is bounded by |Jn|/(l — 
d); for d — 1, we use an estimate of the distance to the 
limit by |X„+i - X„\. 

6.2 Data set 

For the evaluation purpose, we used the web graph im- 
ported from the dataset uk-2007-05(§ 1000000 (available on 
PP) which has 41,247,159 links on 1,000,000 nodes. 

Below we vary A'^ from 10"^ to lO'' extracting from the 
dataset the information on the first A'' nodes. Some graph 
properties are summarized in Table [T] 

• L: number of non-null entries (links) of P; 

• D: number of dangling nodes (0 out-degree nodes); 

• E: rmmbcr of in-degree nodes: the in-degree nodes 
are defined recursively: a node i, having incoming links 
from nodes that are all in-degree nodes, is also a 
in-degree node; from the diffusion point of view, those 
nodes are those who converged exactly in finite steps; 

• O: number of loop nodes (pu ^ 0); 

• maxin = maxi (maximum in-degree, the in-degree 
of i is the number of non-null entries of the i-th line 
vector of P); 

• maxout = max; #outi (maximum out-degree, the out- 
degree of i is the number of non-null entries of the i-th 
column vector of P). 





nb iter 


gain 


time (s) 


gain 


d = 0.5. 


PI 


7 


X 


0.01 


X 


DI 


4.8 


xl.5 





> xl 


m+ 


3.8 


xl.8 





> xl 


d = 0.85. 


PI 


20 
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0.02 


X 


DI 


15.8 


xl.3 





> x2 


DI+ 


11.8 


xl.7 





> x2 


d = 0.99. 


PI 


469 
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0.14 


X 


DI 


202 


x2.3 


0.02 


x7 


DI+ 


193 


x2.4 


0.02 


x7 


d = 0.999. 


PI 


5425 


X 


1.47 


X 


DI 


1379 


x3.9 


0.2 


x7.4 


m+ 


1282 


x4.2 


0.17 


x8.6 


d=i. 


PI 


49 


X 


0.03 


X 


DI 


undefined 


X 


undefined 


X 




242 


xO.2 


0.03 


xl 



Table 2: A'^ = 10^: Comparison of the runtime for 
a target error of 1/A'^. Gain: speed-up gain factor 
w.r.t. PI. 



To guarantee the continuity to d = 1 and to meaningfully 
consider all nodes, the above graph has been completed with 
one random outgoing link for all dangling nodes and with 
one random incoming link for all nodes having no incoming 
links. 

Note that for d < 1, there exists a unique solution (sta- 
tionary probability). For d = 1, DI is not defined, PI may 
not converge {P may not be aperiodic) and DI-|- converges 
to the solution which is the limit of d — > 1. 

6.3 Comparison 

For the evaluation of the computation cost, we used Linux 
(Ubuntu) machines: Intel(R) Core(TM)2 CPU, U7600, 1.20GHz, 
cache size 2048 KB (Linuxl, g + -\ — 4.4). The algorithms 
that we evaluated are: 

• PI: Power iteration (equivalent to Jacobi iteration), 
using row vectors; for d = 1, to force the convergence, 
we used relaxation idea with parameter 0.5; 

• DI: D-iteration with node selection, if {F„)i > r„i x 
#outi/L, where ij^outi is the out-degree of i and r„/ is 
computed per cycle n'; 

• DI+: proposed solution. Initialization to P.e — e fol- 
lowed by D-iteration with node selection, if [(-Fn)!! > 
r„/ x ^outi/L, where #outi is the out-degree of i and 
r„i is computed per cycle n' . 



7. CONCLUSION 

We proposed a new algorithm to accelerate the compu- 
tation of the dominant eigenvector of non-negative matrix 
inspired from the D-iteration diffusion vision. To show its 
potential, first evaluation results are included. 
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d = 0.99. 
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120 


x4.5 
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xll 
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x4.6 
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d = 0.999. 


PI 


7739 
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24.4 
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DI 


919 


x8.4 


1.34 


xl8 


DI+ 


880 


x8.8 


1.25 


x20 


d= 1. 


PI 


393 
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1.26 
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DI 


undefined 
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undefined 


X 


DI+ 


53 


x7.4 


0.07 


xl8 



Table 3: = 10^: Comparison of the runtime for 
a target error of 1/A'^. Gain: speed-up gain factor 
w.r.t. PI. 





nb iter 


gain 


time (s) 


gain 


d = 0.5. 


PI 


12 
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1.1 
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DI 
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x2.1 


0.14 


x8 
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x8 


d = 0.85. 


PI 


53 
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4.6 
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17.4 
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0.44 
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16.6 


x3.2 


0.41 
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d = 0.99. 
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834 
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72 
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x5.1 
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DI+ 
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x5.3 
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PI 
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DI 
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x9.2 
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undefined 
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nb iter 
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time (s) 
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x2.2 
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xll 
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undefined 
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undefined 
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x34 


1605 


x75 



Table 5: N = 10 : Comparison of the runtime for 
a target error of 1/A'^. Gain: speed-up gain factor 
w.r.t. PI. 
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